Cepstral analysis of speech signals in the process of automatic pathological voice assessment
Author: Anna
Abstract
The paper addresses cepstral analysis of speech for automated estimation of the probability of a voice disorder. The author proposes deriving two of the most diagnostically significant voice features, the quality of the harmonic structure and the degree of subharmonics, from the cepstrum of the speech signal. Traditionally, these attributes are assessed by ear or by inspecting the spectrum (or spectrogram), so the analysis often lacks accuracy and objectivity. The proposed parameters were calculated for recordings from the Disordered Voice Database (Kay, model 4337, version 2.7.0), which consists of 710 voice samples (657 pathological, 53 healthy) recorded in a laboratory environment and annotated with a diagnosis and a number of additional attributes (such as age, sex, and nationality). The cepstral features were compared, with respect to diagnostic significance, with similar voice parameters derived from the Multidimensional Voice Program (Kay, model 5105, version 2.7.0), and the comparison is presented graphically. The results show that the cepstral features correlate more strongly with the diagnostic decision and better separate the clusters of healthy and disordered voices. Additionally, both parameters are obtained from a single cepstral transform and do not require prior F0 tracking, because F0 is derived at the same time.
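The abstract does not define the two proposed parameters, so the following is only a minimal sketch of the general idea it describes: one cepstral transform yields both an F0 estimate and a peak height that can serve as a crude indicator of harmonic structure. The function names `real_cepstrum` and `cepstral_peak`, the Hamming window, the search range, and the synthetic test frame are all assumptions made for illustration; they are not the author's method, and the subharmonic measure is not attempted here.

```python
import numpy as np

def real_cepstrum(frame):
    """Real cepstrum: inverse FFT of the log-magnitude spectrum."""
    windowed = frame * np.hamming(len(frame))
    log_mag = np.log(np.abs(np.fft.rfft(windowed)) + 1e-12)  # offset avoids log(0)
    return np.fft.irfft(log_mag, n=len(frame))

def cepstral_peak(frame, fs, f0_min=60.0, f0_max=400.0):
    """Return (f0_estimate_hz, peak_height) from the largest cepstral peak
    inside the quefrency range that corresponds to [f0_min, f0_max]."""
    cep = real_cepstrum(frame)
    q_lo = int(fs / f0_max)        # shortest pitch period, in samples
    q_hi = int(fs / f0_min) + 1    # longest pitch period (exclusive bound)
    peak = q_lo + int(np.argmax(cep[q_lo:q_hi]))
    return fs / peak, float(cep[peak])

# Hypothetical test frame: 30 harmonics of 200 Hz with 1/k amplitudes.
fs = 16000
t = np.arange(2048) / fs
frame = sum(np.sin(2 * np.pi * 200.0 * k * t) / k for k in range(1, 31))

# The search range is an illustrative choice, not a value from the paper.
f0, height = cepstral_peak(frame, fs, f0_min=150.0, f0_max=400.0)
```

A wider search range also admits peaks at multiples of the pitch period, which can compete with the first peak and cause octave errors, so the range should be chosen with the expected voices in mind.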
Similar resources
The Study of Vocal Function in Patients With Early Laryngeal Carcinoma After Transoral Laser Microsurgery
Objective: Today, transoral laser microsurgery is considered one of the first options for controlling early laryngeal cancer, and voice disorder is one of the inevitable complications of this treatment. This study aimed to compare vocal function in patients with early-stage laryngeal cancer after laser surgery with that of healthy individuals with normal voice quality using acoustic ana...
A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)
Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, many non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech from noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...
Artificial Neural Network Based Pathological Voice Classification Using MFCC Features
The analysis of pathological voice is a challenging and important area of research in speech processing. Acoustic voice analysis can be used to characterize pathological voices with the aid of speech signals recorded from patients. This paper presents a method for the identification and classification of pathological voice using an Artificial Neural Network. Multilayer Perceptron Ne...
Voice-based Age and Gender Recognition using Training Generative Sparse Model
Gender recognition and age detection are important problems in telephone speech processing, where voice characteristics are used to investigate the identity of an individual. In this paper, a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and an atom correction post-processing method. Similar to genera...